Abstract



Because a significant portion of U.S. students lacks critical mathematics skills, schools across the
country are investing heavily in computerized curriculums as a way to enhance education output,
even though there is surprisingly little evidence that they actually improve student achievement.
In this paper we present results from a randomized study in three urban school districts of a
well-defined use of computers in schools: a popular instructional computer program which is
designed to teach pre-algebra and algebra. We assess the impact of the program using statewide 
tests that cover a range of math skills and tests designed specifically to target pre-algebra and 
algebra skills. We find that students randomly assigned to computer-aided instruction score at 
least 0.17 of a standard deviation higher on a pre-algebra/algebra test than students randomly 
assigned to traditional instruction. We hypothesize that the effectiveness arises from increased 
individualized instruction as the effects appear larger for students in larger classes and those in 
classes in which students are frequently absent. 




I. Introduction

Mathematical achievement is arguably critical both to individuals and to the future of the 
U.S. economy. For example, research by Grogger (1996) and Murnane, Willett, and Levy (1995)
suggests that math skills may account for a large portion of wage inequality, including the
African-American-white wage gap. And yet, in spite of recent progress, levels of proficiency
remain dramatically low (U.S. Dept. of Education, 2006 - National Assessment of Educational
Progress (NAEP) report). Compounding the problem of poor mathematics performance is the
fact that many school districts report difficulty recruiting and retaining teachers, particularly in
the fields of math and science, where schools must compete with (non-education) private sector
salaries (Murnane and Steele 2007). While the evidence on the importance of teacher
qualifications on student achievement is mixed in many subjects, the students of more qualified
math teachers appear to perform better (see, e.g., Braswell et al. 2001, Boyd et al. 2007).

In response, policymakers, parents, and schools are actively seeking creative and effective
approaches to improving students’ math skills. And, not surprisingly, many school districts are
turning to advances in computer technology. By 2003 nearly all public schools had access to the
internet, and the number of public school students per instructional computer with internet
access had fallen from 12.1 in 1998 to 4.4.¹ Despite this trend, research on the success of
computer technology in the classroom has yielded mixed evidence at best. In economics most
studies have focused on the impact of subsidies for schools to invest in computer technology.
For example, Angrist and Lavy (2002) show a decrease in math achievement among 8th graders
after the introduction of a computer adoption program in Israeli schools. Goolsbee and Guryan
(2006) study the impact of the E-rate - a program to subsidize school investment in the internet -



¹ Table 416 of the Digest of Education Statistics: 2006.

and conclude that while it has substantially increased internet investment, it has had no
significant impact on student achievement thus far. In contrast, Machin, McNally, and Silva
(forthcoming) find that a government program to encourage investment in information and
computer technology in schools in the United Kingdom led to improved performance in English
and possibly science but not in math in primary schools. While it is important to understand
whether and how public subsidies are used and whether they achieve their intended goals,
because the use of computers by the schools in these studies is either unknown or vaguely
defined, they do not provide direct evidence on the effectiveness of computer technology as an
input in the education production function.

Other literature has studied the impact of computer technology on student achievement
more directly.² A relatively recent study of the NELS88 data showed that multimedia and
calculating aids had a strong positive correlation with math achievement while they had little to no
effect in any other subject (Wang, Wang, and Ye 2002). In contrast, Wenglinsky (1998) finds
that, on average, computer use in math instruction is negatively related to student math
achievement in the 8th grade. A potential problem with this second group is that there are few
studies that use a randomized controlled study design, or employ a credible strategy for
controlling for factors such as individual teacher effects and student ability, that might be



² Kirkpatrick and Cuban (1998) define three uses of computers in instruction:
computer-assisted instruction (CAI), computer-managed instruction (CMI), and computer-enhanced
instruction (CEI). CAI provides drill exercises and tutorials. CMI is more elaborate in diagnosing
areas in which students need more instruction, guiding students in their own learning, and recording
progress for the teacher. CEI uses the Internet or other computer programs, such as graphics or
word-processing, to enhance lessons and projects directed by the teacher. The type of computerized
instruction we study is best characterized as computer-aided instruction, although it also contains
elements of computer-managed instruction. We use the terms computer-aided instruction and
computerized instruction interchangeably.

correlated with both use of computers in the classroom and student outcomes.³ For example,
given that computer technology may be used either to help poorly performing students or to
enhance the learning of high achievers, it is unclear whether selection bias would generate
upward or downward biased estimates of the average impact of computer technology on student
achievement in poorly designed studies.

Three notable exceptions include a randomized evaluation of computer-assisted
instruction conducted in the late 1970s by the Educational Testing Service and the Los Angeles
Unified School District that consisted of drill and practice sessions in mathematics, reading, and
language arts (Ragosta et al., 1982); the study found educationally large effects in math and
reading. More recently, using a randomized study design, Banerjee et al. (2005) conclude that
computer-assisted mathematics instruction boosted the math scores of fourth-grade students in
Vadodara, India. In contrast, after randomly assigning students to be trained using a computer
program known as Fast ForWord, which is designed to improve language and reading skills,
Rouse and Krueger (2004) conclude that while use of the computer program may have improved
some aspects of students’ language skills, such gains did not appear to translate into a broader
measure of language acquisition or into actual reading skills. Overall, one can conclude that this
literature is also mixed, although there may be more support for the effectiveness of computer
technology in the instruction of math than in reading. Notably, however, few studies offer



³ In an oft-cited, and somewhat controversial, review of the literature, Cuban (2001)
concludes, “When it comes to higher teacher and student productivity and a transformation of
teaching and learning . . . there is little ambiguity. Both must be tagged as failures. Computers have
been oversold and underused, at least for now.” (p. 179). Others argue for a more nuanced view of
the literature: that computers can be effective in certain situations, such as when used by teachers
with skill and experience in using computers themselves (see, e.g., Brooks (2000)).

evidence on why the technology may help or hinder student achievement, and the most recent
evidence for math may not apply to U.S. students.

In this paper we present results from a new randomized study in three urban school 
districts in the U.S. of a well-defined use of computers in schools: a popular instructional 
computer program which is designed to improve pre-algebra and algebra skills. We assess the 
impact of the program using both statewide tests that cover a range of math skills and tests 
designed specifically to target pre-algebra and algebra skills. We find that students randomly 
assigned to classes using the computer lab score at least 0.17 of a standard deviation higher on 
tests of pre-algebra and algebra achievement than students assigned to traditional classrooms. 
The estimated effect rises to 0.25 of a standard deviation when we estimate the effect for 
students who actually use the computer-aided instruction. We find some evidence for the 
hypothesis that the effectiveness arises from increased individualized instruction as the effects 
appear larger for students in larger classes and those in classes in which students have poor 
attendance records. 

In the next section we discuss why and in which circumstances CAI may be more 
effective than traditional instruction. Section III presents the empirical model, research design,
and data. Section IV presents the results, Section V evaluates the cost effectiveness of CAI,
and Section VI concludes.

II. Why Might CAI Be More Effective than Traditional Instruction?

A key question is why CAI may be more effective than traditional classroom teaching, on 
average. Some classroom research suggests computers can offer highly individualized 
instruction and allow students to learn at their own pace (e.g., Lepper and Gurtner 1989, Means
and Olson 1995, Sandholz et al. 1997, Heath and Ravits 2001). While we do not have a direct
test, we hypothesize that if CAI allows for more individualized instruction, then it may be more
beneficial for struggling students who cannot keep up with the pace of the lectures in traditional
classrooms or for more advanced students who could progress faster at their own pace.⁴ Further,
we might expect CAI to be more effective for students with poorer rates of attendance. In a
traditional classroom, students missing class will miss all of the material covered in class that
day. In contrast, the computer always picks up where the student left off the last time she was in
class regardless of whether it was the day before or 5 days before. Similarly, in classes in which
many students have poor attendance records or in larger classes, we might expect a bigger effect
of CAI as teachers would struggle to find the appropriate level at which to pitch lectures.
Finally, one might think that individualized instruction provided by CAI avoids some of the
disruption effects of having peers with poor attendance rates or being in larger classes as
modeled by Lazear (2001).

More formally we can follow Brown and Saks (1984) and think of the teacher as
allocating class time to different types of instruction. In the traditional classroom, the teacher
divides class time between group instruction time, $T_G$, and individual instruction time, $T_i$, such
that

$$T_G + \sum_j T_j \le T, \qquad (1)$$



⁴ Other forms of self-paced instruction may offer a similar educational advantage. However,
a very small, older literature suggests that computerized self-paced instruction is more effective
than other self-paced instruction. See, e.g., Enochs, Handley, and Wollenberg (1986) and Surber
et al. (1977) for randomized studies involving college-age students.

where $T$ is the total class time available. Thus, total instruction time for student i equals

$$T_G + T_i \le T. \qquad (2)$$

As long as other students in the class receive some individual instruction time, the total 
instruction time for student i is strictly less than the total class time available. 

In the CAI classroom, the teacher also allocates class time between group and individual 
instruction, but computer-aided instruction effectively increases the productivity of individual 
instruction time. Namely, while the teacher spends time working with student j, student i can be 
working on the computer and receiving additional instruction. In contrast to individual 
instruction time, student i can receive an additional minute of CAI time ($C_i$) without reducing the
total amount of instruction time available to student j. Total instruction time for student i equals

$$T_G + T_i + C_i \le T. \qquad (3)$$



Let student achievement, $A_i$, be a function of instruction time and individual
characteristics, $X_i$, so that

$$A_i = f(T_G, T_i, C_i, X_i) \qquad (4)$$

and $f_1 \ge 0$, $f_2 \ge 0$, and $f_3 \ge 0$, where $f_k$ denotes the partial derivative of $f$ with respect to
its $k$-th argument. Since $f_3 \ge 0$, student i’s achievement in the CAI classroom will be greater than
or equal to student i’s achievement in the traditional classroom for any given allocation of $T_G$ and
$T_i$, i.e.,

$$f(T_G, T_i, C_i, X_i) \ge f(T_G, T_i, 0, X_i). \qquad (5)$$

Note that the relative advantage of computerized instruction will depend on the
suitability of the curriculum for the students in question, which will affect the magnitude of $f_3$.
Suppose further that the teacher maximizes her utility by allocating each student the same
amount of individual instruction time. For a class of $N$ students,

$$T_i \le \frac{1}{N}\,(T - T_G). \qquad (6)$$



Thus, for a given time allocation to group instruction, $T_G$, $T_i$ decreases as class size increases. In
the CAI class this means that $C_i \le \frac{N-1}{N}\,(T - T_G)$, so the potential gain in total instruction time
for student i of moving from a traditional class to a CAI class is increasing in class size $N$.
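The class-size argument can be illustrated numerically. The following is a minimal sketch, assuming the upper bounds on individual time and CAI time hold with equality; the function names and the 50-minute period with 20 minutes of group instruction are hypothetical values chosen only for illustration.

```python
def max_individual_time(total_time, group_time, class_size):
    """Upper bound on individual instruction time per student: (T - T_G) / N."""
    return (total_time - group_time) / class_size


def max_cai_time(total_time, group_time, class_size):
    """Upper bound on CAI time per student: (N - 1) * (T - T_G) / N."""
    return (class_size - 1) * (total_time - group_time) / class_size


# Hypothetical class period: T = 50 minutes, of which T_G = 20 are group instruction.
T, T_G = 50.0, 20.0
for n in (10, 20, 30):
    # As N grows, individual time per student shrinks while potential CAI time rises.
    print(n, max_individual_time(T, T_G, n), max_cai_time(T, T_G, n))
```

Under these assumptions the bound on individual teacher time falls with class size while the bound on CAI time rises, which is the sense in which the potential gain from CAI is increasing in class size.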

Similarly, one might assume instead that individual instruction time (or at least some of 
it) is non-productive and related to the teacher needing to deal with individual student behavioral 
problems. Assuming that student j’s disruptive behavior reduces group instruction time and/or
individual instruction time but does not also disrupt student i’s ability to work on the computer,
the gain in total instruction time for student i of moving from a traditional class with a disruptive 
student to a CAI class with a disruptive student is greater than the gain from changing classroom 
types with a class with no disruptive students. 



III. Evaluating Computer-Aided Instruction (CAI)
A. The Empirical Model

The primary research question we examine is whether mathematics instruction is more
effective when delivered via computer programs or using traditional (“chalk and talk”) methods. 
In designing the study, we were concerned about two sources of bias that might arise using 
observational data in which we simply compared the outcomes of students taught using CAI to 
those taught using more traditional methods. The first is that principals and/or teachers may 
choose to put students they believed would particularly benefit from computerized instruction 
into the labs. This bias would overstate the effect of CAI relative to traditional instruction. 

A second source of bias is that more (or less) motivated teachers may be more willing to 
try computerized instruction than their less (or more) motivated peers who would prefer to 
continue teaching using traditional methods. Thus, a key concern with the existing literature on 
the effectiveness of computer-aided instruction is that the students taught by teachers willing to 
teach using the computerized instruction would have outperformed their classmates who were 
taught by other teachers, regardless of whether or not the students had been in the computer lab. 
That is, the previous researchers may have confounded a teacher effect with the effectiveness of 
the computer program. 

To control for both types of selection bias, we implemented a within-school random 
assignment design at the classroom level. We randomly assigned classrooms of students (in 
which the classroom is the group of students taught by a particular teacher during a particular 
class period in a particular school) to be taught in the computer lab or using “chalk and talk.”⁵

⁵ Note that randomly assigning students to be taught in the computer lab or not answers a
slightly different question: whether being taught in the computer lab - regardless of how classes
are typically formed within schools - would generate improvement relative to traditional instruction.
Our approach of randomly assigning classes comes much closer to the policy question faced by
school principals and superintendents, which is whether instruction for a particular class should

Because classes (with the assigned teacher) will have been randomly assigned, the observed -
and unobserved - characteristics of the students and teachers assigned to the computer lab will
be identical to those of the students and teachers that were not, on average.

Our first empirical model that takes advantage of the randomization generates estimates 
of the “intent to treat” effect of using computerized instruction. In these models, the test scores 
of students in classes randomly assigned to the computer lab are compared to the test scores of 
students in classes randomly assigned to the control group, whether or not the students remained 
in their original class assignments. To estimate the intent-to-treat effect, we estimate ordinary 
least squares (OLS) regressions of the following model: 

$$Y_{ikj} = \alpha + X_i\beta + \gamma R_{ikj} + p_j + \varepsilon_{ikj} \qquad (7)$$

where $Y_{ikj}$ represents the score of student i with teacher k in period j on one of the follow-up tests,
$R_{ikj}$ indicates whether the student was assigned to a class that was randomly assigned to a computer
lab, $X_i$ represents a vector of student characteristics (including, in most specifications, the
student’s baseline test scores), $p_j$ is the randomization pool,⁶ $\varepsilon_{ikj}$ is a random error term, and
$\alpha$, $\beta$, and $\gamma$ represent coefficients to be estimated. The coefficient $\gamma$ represents the “intent to treat”



occur in the computer lab or in a traditional classroom. We also note that it would be a logistical 
nightmare to randomly assign students and teachers to classes at the middle or high school level 
irrespective of their other classroom scheduling needs. That said, the districts in which we 
conducted this study all use computer software to assign students to classes and they claim this 
assignment is basically random, as discussed in footnote 11 below.

⁶ As described below, in most cases the randomization pool is the class period of the class
(within a particular school).

effect and estimates the effect of assigning students to be taught using CAI on the outcome in
question.
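To make the intent-to-treat regression concrete, the following is a minimal sketch that estimates the coefficient on random assignment with randomization-pool fixed effects. It is a simplification and not the estimation code used in the study: it omits the student covariates, obtains the coefficient by demeaning within pools (which is numerically equivalent to including pool dummies in OLS), and uses made-up records. The function name `itt_estimate` and the data are hypothetical.

```python
from collections import defaultdict


def itt_estimate(records):
    """OLS coefficient on assignment with pool fixed effects and no covariates.

    records: iterable of (pool, assigned, score), where assigned is 0 or 1.
    Scores and assignment are demeaned within each pool before regressing.
    """
    by_pool = defaultdict(list)
    for pool, assigned, score in records:
        by_pool[pool].append((assigned, score))

    numerator = 0.0
    denominator = 0.0
    for rows in by_pool.values():
        r_bar = sum(r for r, _ in rows) / len(rows)
        y_bar = sum(y for _, y in rows) / len(rows)
        for r, y in rows:
            numerator += (r - r_bar) * (y - y_bar)
            denominator += (r - r_bar) ** 2
    return numerator / denominator


# Hypothetical data: two pools with different average scores, same assignment effect.
records = [
    ("pool_1", 1, 0.3), ("pool_1", 1, 0.5), ("pool_1", 0, 0.1), ("pool_1", 0, 0.3),
    ("pool_2", 1, 1.2), ("pool_2", 1, 1.0), ("pool_2", 0, 1.0), ("pool_2", 0, 0.8),
]
gamma = itt_estimate(records)
print(round(gamma, 3))
```

Because the comparison is made within pools, the higher average score in the second hypothetical pool does not contaminate the estimated assignment effect.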

As noted above, because we randomly select classrooms, our research strategy should
generate estimates of the intent-to-treat effect that are not affected by potential self-selection of
teachers into the lab. However, this is only strictly true in large samples and so one might also
be concerned that - by chance - the more (or less) motivated teachers ended up in the computer
lab. If more motivated teachers ended up being selected to teach in the computer lab, then OLS
estimates of the effect of CAI on student outcomes will be biased upwards. One could control
for this bias by comparing the achievement of students with teachers who teach both in and out
of the lab. That is, one can control for a teacher fixed effect. Indeed, in their meta-analysis of
the research, Kulik and Kulik (1991) concluded that in studies in which the same teacher taught
both the computer-aided class and the comparison class, the differences in achievement were
much lower than when the two types of classes had different teachers, which is consistent with
teacher selection bias.

At the same time, this result - that the effect of CAI is lower in the presence of teacher 
fixed effects - would also obtain if there are spillovers in teaching techniques such that teachers 
import lessons learned from the lab to their traditional classes. In this case, the spillover will 
attenuate the estimated impact of computerized instruction. In our study some of the 
participating teachers taught both in a computer lab and using traditional methods while others 
taught exclusively in the lab or exclusively out of the lab.⁷ This variation allows us to control for

⁷ An issue that can arise in studies of this kind is that the teachers and associated staff are
unfamiliar with the intervention and therefore not properly trained to use it effectively. All three
districts had been using this CAI program on a small scale before our study began (Districts 2 and

the quality of the teacher (by including a teacher fixed effect) and to compare results with and
without the teacher fixed effects.⁸

A potential problem with the intent-to-treat estimation is that school staff may
“contaminate” the experiment by assigning students from the control group (or from outside of
the study) to a CAI lab class. Or, they may assign students originally in a computerized class to
a traditionally-taught class. While throughout the study we emphasized the importance of
maintaining the original student assignments and the principals and teachers indicated that they
understood this importance, some contamination did occur. While the intent-to-treat effect
represents the gains that a policymaker can realistically expect to observe with the program
(since one cannot fully control whether students initially assigned to a class in the lab actually
remain in that class), it does not necessarily represent the effect of the program for those who
actually complete it.

Therefore, we also implement instrumental variables (IV) models in which we use
whether the student was in a class randomly assigned to a computer lab as an instrumental 
variable for actual participation. The random assignment is correlated with actual participation 
in a computer lab but uncorrelated with the error term in the outcome equation (since it was 



3 for at least one year before our study, and District 1 since 1995), and therefore some of the 
teachers had already been trained and were familiar with the program. Further, all CAI teachers
received training and support from both the company and district support staff throughout the study. 

⁸ Unfortunately, if we find that the estimated impact of CAI is smaller when we control for
fixed effects than when we do not, we will not be able to distinguish whether this is due to more
motivated teachers having been selected to be in the lab or to the existence of spillovers from the
CAI instruction to traditional instruction. Obviously, if we find that the impact is larger in the
presence of teacher fixed effects, we might conclude that, at a minimum, the less motivated teachers
were assigned to the lab, by chance, and that this effect was not outweighed by any potential
spillovers.

determined randomly). In this case, the second-stage (outcome) equation is represented by
models such as

$$Y_{ikj} = \alpha' + X_i\beta' + \delta\,\mathrm{CAI}_{ikj} + p'_j + \varepsilon'_{ikj} \qquad (8)$$

where $\mathrm{CAI}_{ikj}$ indicates whether the student completed at least one lesson in a computer lab, $\delta$
indicates the effect of being taught through computerized instruction on student outcomes, and
the other variables and coefficients are as before. Through the use of instrumental variables one
can generate a consistent estimate of the effect of computerized instruction on student outcomes.
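The logic of the IV estimate can be sketched with the simplest possible case. With a binary instrument and no covariates or pool effects, the IV estimator reduces to the Wald ratio: the difference in mean outcomes by random assignment divided by the difference in participation rates by random assignment. The sketch below assumes that simplified setting; the function name `wald_iv` and the records are hypothetical and do not reproduce the study's data or specification.

```python
from statistics import mean


def wald_iv(records):
    """Wald (binary-instrument IV) estimate of the effect of participation.

    records: iterable of (assigned, participated, score), with 0/1 indicators.
    Returns (intent-to-treat effect, first-stage difference, IV estimate).
    """
    assigned_rows = [(p, y) for a, p, y in records if a == 1]
    control_rows = [(p, y) for a, p, y in records if a == 0]

    # Reduced form: difference in mean scores by random assignment.
    itt = mean(y for _, y in assigned_rows) - mean(y for _, y in control_rows)
    # First stage: difference in participation rates by random assignment.
    first_stage = mean(p for p, _ in assigned_rows) - mean(p for p, _ in control_rows)
    return itt, first_stage, itt / first_stage


# Hypothetical data with imperfect compliance in both groups.
records = [
    (1, 1, 0.5), (1, 1, 0.5), (1, 1, 0.5), (1, 0, 0.1),
    (0, 0, 0.1), (0, 0, 0.1), (0, 0, 0.1), (0, 1, 0.5),
]
itt, first_stage, delta = wald_iv(records)
print(round(itt, 3), round(first_stage, 3), round(delta, 3))
```

When some assigned students never use the lab, or some control students do, the first-stage difference is below one, so the IV estimate exceeds the intent-to-treat estimate in absolute value.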

Note that random assignment occurred at the classroom level even though we have data 
available for each student. Therefore, we adjust our standard errors to account for the fact that 
the randomization occurred at the classroom level.⁹
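One simple way to respect classroom-level randomization is to treat the class, not the student, as the unit of observation. The sketch below compares class mean scores between assigned and control classes and computes a standard error from the variation across classes. It is an illustration under simplifying assumptions (no covariates, no randomization-pool effects, unequal-variance standard error), not the adjustment used in the study; the function name `class_level_effect` and the data are hypothetical.

```python
import math
from statistics import mean, variance


def class_level_effect(classes):
    """Difference in mean class scores and its standard error.

    classes: iterable of (assigned, scores), where assigned is 0 or 1 and
    scores is the list of student scores in that class. Needs at least two
    classes in each group so the across-class variance is defined.
    """
    treated = [mean(scores) for assigned, scores in classes if assigned == 1]
    control = [mean(scores) for assigned, scores in classes if assigned == 0]

    diff = mean(treated) - mean(control)
    # Variation is measured across class means, so the number of classes,
    # not the number of students, drives the precision.
    se = math.sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return diff, se


# Hypothetical data: two assigned classes and two control classes.
classes = [
    (1, [0.4, 0.6]), (1, [0.2, 0.4]),
    (0, [0.1, 0.3]), (0, [0.0, 0.2]),
]
diff, se = class_level_effect(classes)
print(round(diff, 3), round(se, 3))
```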

B. Computer-Aided Instruction 

We study the effectiveness of computer-aided instruction by focusing on a group of 
computer programs known as I Can Learn® (or “Interactive Computer Aided Natural Learning”)
distributed by JRL Enterprises. The system is composed of both a software and hardware
computer package that is designed to deliver instruction through technology on a one-on-one
basis to every student; the curriculum is designed to meet the National Council of Teachers of
Mathematics (NCTM) standards. In addition to the interactive teaching system, the software 

⁹ In addition, we have estimated our models using data aggregated to the classroom level,
and using classroom random effects, with similar results.

package also includes a classroom management tool for educators and the company provides
on-site support for administrators and teachers.

The CAI program allows students to study math concepts while advancing at their own
pace, enabling them to spend the necessary time on each subject lesson. Each lesson has five
independent parts - a pretest, a review (of prerequisites needed for the lesson), the lesson, a
cumulative review, and comprehensive tests. Students who do not pass the pretest or review are
made to repeat the lesson until they achieve a certain degree of mastery. Each student’s
performance is recorded in a grade book and teachers can monitor students’ progress through a
series of reports. The teacher’s role in this environment is to provide targeted help to students
when they need additional assistance. In addition, the computer program covers many
administrative aspects such as lesson planning, grading and homework assignment so that
teachers may spend more time on individual instruction with struggling students. Previous
quasi-experimental studies of the effectiveness of this group of computer programs have yielded
mixed results (see, e.g., Brooks 2000, Kerstyn 2001, Kirby 1995, and Kirby 2004).

C. The Research Design 

1. The Sites

We conducted the study in three large urban school districts: one in the northeast, one in
the midwest and one located in the south. Each of these districts has slightly different
demographics but suffers similar problems in the areas of underachievement and teacher
recruitment. As shown in Table 1, these districts have a high proportion of minority students
who are considerably poorer than the national average. District 1 has a student enrollment of
nearly 68,000 students, 94% of whom are African American and 1% of whom are
Hispanic. District 2 serves just over 22,000 students, 40% of whom are African American and
54% of whom are Hispanic. District 3 serves approximately 97,000 students, 59% of whom are
African American and 18% of whom are Hispanic.

2. Implementation 

To implement our randomized design, near the beginning of the academic year the 
participating schools provided us with their schedule of pre-algebra and algebra classes.¹⁰ We
then randomly selected the treatment classes (taught using CAI) and the control classes (taught
traditionally). Officials in the schools were not informed of the outcome of our randomization
until they had finished assigning students to classes to protect against students being assigned to
classes on the basis of whether they would be taught using traditional methods or in the computer
lab.¹¹ Once students were assigned to classes, we informed the schools which classes should use
CAI and which should be taught using a traditional method. 

We conducted the study during the 2004-2005 school year in 8 high schools and 2 middle 
schools in District 1; and during the 2003-2004 school year in 4 high schools in District 2 and in 



¹⁰ The schools were given the option of eliminating particular teachers and/or classes from
the study before the randomization. The extent to which the schools exercised this option varied.

¹¹ That said, the schools claimed that the process by which they assigned students was
basically random. We have assessed this claim by comparing the standard deviation of baseline test 
scores within the observed classes with the mean standard deviation that one would obtain if 
students were assigned to classes randomly (within a particular level). Consistent with the schools’ 
claims, we found that the observed variation in baseline “ability” within classes was similar to that 
which would obtain if students were randomly assigned. Similarly, the spread of baseline test scores 
was much larger than what one would have expected if students were strictly “tracked.” 







3 high schools in District 3. As shown in Table 2, the schools in our study in District 1 had a
slightly higher percentage of African American students (97%) compared to the schools in the
district; the study schools in District 2 were roughly similar to those in all schools in the
district; and the schools in District 3 had a larger percentage of African American students (93%)
and a smaller percentage of Hispanic students (1.2%) compared to the district average. In most
cases, the students in the classes within the schools that participated in the study were
representative of the students in the schools (with the exception that in District 1 the average
percentage of students that were African American in the study was smaller than that in the
schools (88% vs. 97%)).

As shown in Appendix Table 1, our study originally included a total of 17 schools, 147
classes, and 61 teachers. These 147 classes were grouped into 60 “randomization pools” which
represented the groups of classes from which we randomly selected candidates for the treatment
and control groups. These pools mostly represented a class period, although in a few cases, there
were not enough classes from which to randomly pick one to go into the lab and so we combined
classes from two periods.¹² Because of mobility, our analysis sample - which is limited to
students with follow-up test scores using our main outcome (that on a specially designed algebra
test, see below) - comprises 17 schools, 141 classes, 59 teachers, and 60 randomization
pools.¹³



¹² Typically there were only one or two computer labs in each school (one school had three
labs) such that there were more math classes than labs available in any one period.

¹³ When we further limit the sample to students with baseline test scores on our main
outcome we have 17 schools, 137 classes, 57 teachers, and 60 randomization pools, as shown in
Appendix Table 1.

D. Data

1. Academic Outcomes

We primarily assess the impact of CAI on student achievement using test instruments.
First, we sought an exam that was closely aligned with the material in the mathematics courses.¹⁴
Thus, we contracted with the Northwest Evaluation Association (NWEA), a non-profit
organization that has partnered with more than 2,300 school districts (serving more than 2 million
students) to provide assessments, reports, classroom resources and professional development.
NWEA designed a customized paper-and-pencil exam that targeted specific pre-algebra and
algebra skills outlined in the district’s course objectives and the CAI curriculum. (In theory, the
CAI curriculum was adapted to meet each district’s course objectives.) NWEA created a 30-item
multiple choice exam for both pre-algebra and algebra. The same exams were created for
Districts 2 and 3. Slightly different exams were created for District 1 to match the district’s
standards. However, the exams in District 1 were designed to match the exams used in the other
two districts to allow for pooled analysis.

We observe post-test scores for 1,872 students across all three districts (1,165 in District
1, 477 in District 2, and 230 in District 3). However, in some analyses we also control for the
student’s pre-test. Thus, in the sample that includes both pre- and post-NWEA tests we have
1,585 students (973 in District 1, 412 in District 2, and 200 in District 3). Further, we convert



Note that we did not administer the Terra Nova algebra test, a eommon nationally-normed 
mathematies test, beeause many of the distriet offieials were eoneemed it does not eontain suffieient 
items related to pre-algebra and lower-level algebra. 




17 



the baseline and follow-up test seores to standard deviation units using the standard deviation of 
the baseline test seore.'^ 
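
As a minimal sketch of this scaling, the Python snippet below divides both baseline and follow-up scores by the standard deviation of the baseline scores. The function name and the toy scores are illustrative assumptions, not the study's data, and the choice of the sample (n - 1) formula is also an assumption, since the text does not say which formula was used.

```python
from statistics import stdev


def to_baseline_sd_units(baseline, followup):
    """Express both score lists in units of the baseline standard deviation.

    Scores are only divided by the baseline SD; the mean is not subtracted
    (an assumption about the scaling, made for illustration).
    """
    sd = stdev(baseline)  # assumption: sample SD with an n - 1 denominator
    return [b / sd for b in baseline], [f / sd for f in followup]


# Toy raw scores (hypothetical, not from the study).
pre = [10.0, 20.0, 30.0]
post = [15.0, 25.0, 35.0]
pre_sd, post_sd = to_baseline_sd_units(pre, post)
print(pre_sd)   # baseline scores in baseline-SD units
print(post_sd)  # follow-up scores on the same scale
```

Because both tests share one divisor, a difference between follow-up means on this scale can be read directly as a fraction of a baseline standard deviation.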

We also assess the impact of CAI using the statewide tests administered by each state. In
District 1, we only have post-treatment state test data for the students in the 8th grade; we use the
district-administered Iowa Test of Basic Skills (ITBS) from the 7th grade as the pretest. At the
time of our study, students in Districts 2 and 3 were tested in mathematics on state-wide tests in
4th, 8th and 10th grades. Since in these districts the students in the study were primarily in 9th
grade, we use the 8th grade statewide test as the pre-test and the 10th grade test as the post-test.
The mean of the (standardized) baseline statewide test in District 1 is 9.2; that in District 2 is 6.7;
and that in District 3 is 16.7. Again, the test scores were standardized to have a baseline
standard deviation of one within each district.¹⁶



¹⁵ We standardize using the standard deviation of the baseline test score for all students
across the three districts, which is 9.20. We have also used “national” standard deviations which
range from 16.7 for 8th grade students to 17.4 for grades 10 and higher. Not surprisingly, this cuts
the estimated effect sizes by roughly one-half. We chose to present the effects using the standard
deviation within the study for two reasons. First, we have also estimated the effects using “growth
norm” gains - the effect of CAI on the expected one-year growth in test scores (this norming takes
into account that initially-low scoring students typically make larger yearly gains than initially
higher-scoring students). Translated, these estimates are more similar to the effect sizes using the
district standard deviation than the national standard deviation, reflecting that our sample of students
is by-and-large initially low achieving. As such, the study standard deviation better reflects the
population in question. In addition, we only have district (or study) standard deviations for some
of the outcomes such that the results are more consistently presented across outcomes when we use
the district or study standard deviation. The results using both the growth-norms and national
standard deviation are available on request.

¹⁶ Before we standardize the test scores, the standard deviation of the baseline statewide test
in District 1 is 23.3; that in District 2 is 31.7; and that in District 3 is 39.1. For District 1 we
standardize the 8th grade follow-up test score using the standard deviation of the 8th grade test for
the study 9th graders because the pre- and post tests are not the same test. The standard deviation of
the 9th graders’ 8th grade test is 44.7.

In addition, pre-algebra students in District 1 took mini-math exams - benchmark
pre-algebra exams - throughout the semester. These tests were intended for use by the teacher and
district to track students’ progress. The initial benchmark test has a mean of 18.7 and a standard
deviation of 5.7. We standardize the initial benchmark test to have a standard deviation of one
and also standardize the 2nd and 3rd quarter benchmark tests using the initial test score standard
deviation.

Because we do not have a way of standardizing the state tests across the districts, we
analyze these data separately by district. The sample size of students in District 1 with both pre-
and post-tests is 237; that in District 2 is 341, and that in District 3 is 199. Further, the sample
size for the benchmark tests in District 1 is 230. We emphasize that while the state tests have the
advantage of being high-stakes and therefore of great importance to the districts, as little as 10%
of the state exams in mathematics contain test items related to pre-algebra and/or algebra. As
such, they may have low power to detect effects of a pre-algebra/algebra intervention.¹⁷

¹⁷ In one of the districts we were able to identify individual test items that were related to
pre-algebra and algebra. Not surprisingly, our estimates were quite noisy given that there were very few
test items on which to measure the students’ performance.

Despite the fact that only a fraction of the state tests focuses on pre-algebra and algebra,
the three test assessments are reasonably highly correlated. For example, the correlation
between the baseline NWEA test and the state math tests ranges from 0.30 (in District 1) to 0.73
(in District 2). Further, in District 1 the correlation between the baseline algebra test and the
baseline benchmark test is 0.57 and that between the state math test and the baseline benchmark
pre-algebra test is 0.62. Thus, while two of our three assessments are not based on nationally
normed exams, they nonetheless appear to be correlated with the high-stakes state tests.¹⁸

2. Other Data 

The statistical office in each district also provided us with administrative data on
students. The data included student identifiers and limited characteristics (such as the student’s sex,
race/ethnicity, and eligibility for a free or reduced-price lunch). In two of the three districts we
also obtained data on the number of days the students attended school the previous year and the
year in which we conducted the study; and we have limited information on in- and out-of-school
suspensions. In addition, we gauge each student’s engagement with the program and the
time-on-task through tracking data that comes with the computerized program. Importantly, these
data allow us to determine which students ever actually trained in the computer lab versus in a
traditional classroom for the analysis estimating the effect of the treatment on the treated.

IV. Results

A. Descriptive Statistics 

The first order of business is to determine if assignment to the computer lab appears
random. Table 3 shows the mean of student characteristics by whether the student’s class
was assigned to the CAI lab or was assigned to receive traditional instruction. The top panel
uses the full sample of students who were randomly assigned at the beginning of the academic
year. We see that the proportions of female, African American, and Hispanic students are quite
similar using the full sample. Further, the baseline test scores are identical.

¹⁸ For comparison, Figlio and Rouse (2006) report that in a subset of Florida districts the
correlation between student performance on a nationally-normed test (the NRT) and the FCAT
curriculum-based assessments (known as the Sunshine State Standards (FCAT-SSS) examinations)
is approximately 0.8.

However, there is significant mobility among students in the districts such that we were
unable to post-test all of the students. A major concern is that the attrition between the
beginning and end of the study was uneven between the treatment group and the control group,
thereby introducing statistical bias into the analysis. We therefore compare the observable
characteristics of the students in the treatment and control groups using the sample of students
for whom we also have both the baseline and follow-up data on the NWEA test in the bottom
panel. Again, there is no difference in the baseline pre-algebra/algebra test score; however, there
are small differences in the percentage of students that are African American and Hispanic that
are statistically significant at the 6% level.¹⁹ As a result, in most specifications we control for
the sex, race and ethnicity of the student.

¹⁹ We note, however, that these differences in race and ethnicity arise in only one district
(District 2).

B. Overall Intent-to-Treat and Treatment-on-the-Treated Estimates

Table 4a presents the OLS estimates of the intent-to-treat effects of CAI represented by
equation (1) as well as an instrumental variables (IV) estimate of the effect of
treatment-on-the-treated using the NWEA test as an outcome. Column (1) presents the straightforward mean
difference in the post-test between students learning algebra using CAI and those learning in a
traditional classroom adjusted only for dummy variables representing the randomization pool.
The standard errors reported allow for within-classroom correlation. We estimate that, on
average, students in CAI scored 0.17 of a standard deviation higher on the post-test than did
those in a traditional classroom, and this difference is statistically significant at the 5% level.
When we add controls for the sex and race/ethnicity of the student, in column (2), the random
assignment effect does not change.

In column (3) we present the same specification as that in column (1) but restrict the
sample to those students who also had a pre-test. The basic effect of CAI is slightly higher -
21% of a standard deviation - among the subset of students with baseline test scores, although
the estimate is within a standard error of that in column (1).²⁰ Note that the coefficient estimate
falls slightly when we include the baseline test score (columns (4) and (5)), although the
estimate is not statistically different from that in column (3). Thus, we estimate that the effect
of being placed in a CAI classroom relative to a traditional classroom is an educationally and
statistically significant 0.17 of a standard deviation. To interpret this effect differently, when we
use the growth-normed test scores, we find that students assigned to a CAI classroom achieve
26% of a grade-level more than their peers at the end of the semester.
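
To make the intent-to-treat comparison concrete, here is a minimal Python sketch of a regression of the post-test on assignment with randomization-pool dummies. It is written in the spirit of the specification described above, not as the authors' code: the function name and the toy data are hypothetical, the coefficient is obtained by demeaning within pool (which gives the same coefficient as including pool dummies), and the classroom-clustered standard errors used in the paper are not computed.

```python
from collections import defaultdict


def itt_with_pool_effects(outcome, assigned, pool):
    """OLS coefficient on `assigned`, controlling for pool dummies.

    Demeaning outcome and assignment within each pool and regressing one
    on the other reproduces the dummy-variable coefficient.
    """
    members = defaultdict(list)
    for i, p in enumerate(pool):
        members[p].append(i)

    num = 0.0
    den = 0.0
    for idx in members.values():
        y_bar = sum(outcome[i] for i in idx) / len(idx)
        t_bar = sum(assigned[i] for i in idx) / len(idx)
        for i in idx:
            num += (outcome[i] - y_bar) * (assigned[i] - t_bar)
            den += (assigned[i] - t_bar) ** 2
    return num / den


# Hypothetical standardized post-test scores for two randomization pools.
scores = [1.2, 1.0, 1.0, 0.8, 2.2, 2.0, 2.0, 1.8]
assigned = [1, 1, 0, 0, 1, 1, 0, 0]   # 1 = class assigned to the CAI lab
pools = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(itt_with_pool_effects(scores, assigned, pools))
```

In this toy example the two pools have different score levels, so a raw comparison across pools would mix level differences with the assignment effect; the within-pool comparison avoids that.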

However, if some contamination occurred in the study, these OLS estimates will
understate the potential educational gains by students who are actually taught in the lab. To the
extent that students assigned to classrooms to be taught using traditional methods spent time in
the lab and students assigned to the lab did not receive their algebra instruction there, the
intent-to-treat estimates may be too small. Table 5 shows the number of lessons students were
expected to complete given the course taken; the percentage of students completing no lessons,
more than 10 lessons and more than 20 lessons in the CAI; the number of lessons the student
actually completed; and the number of lessons completed as a fraction of the CAI course
expectations by whether the student was assigned to the treatment group or the control group.

²⁰ Further, when we regress whether the student is missing the baseline test score on a variety
of student characteristics, none of the characteristics significantly differ between those with and
without baseline test scores.

Note, first, that there is no difference in the number of CAI lessons that students would 
have been expected to complete based on the level of their math class and the school’s schedule. 
However, there is evidence of some, although not extensive, contamination. For example, 84% 
of students assigned to the lab completed at least 10 lessons in the lab; 15% of those assigned to 
classes to be taught using traditional instruction completed at least 10 lessons in the lab as well. 
Similarly, while treatment students completed an average of 33 lessons using CAI, the control 
group students completed an average of 5.6 lessons. And, while the treatment students appear to 
have completed about 64% of the lessons they would have been expected to complete using CAI, 
the control students completed 10%. 

We address this contamination by using IV to estimate equation (2), the results of which 
are in column (6). In this specification we identify students who were “treated” as those who 
completed at least one lesson in the computer lab and instrument for this indicator with the 
random assignment of the student’s class. This strategy provides a consistent estimate of the 
effect of “treatment-on-the-treated.” We estimate that students who actually receive instruction 
using CAI score 0.25 of a standard deviation higher than those who received instruction in a 
traditional classroom, and the difference is statistically significant. 
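
The logic of instrumenting actual treatment with random assignment can be sketched in a few lines of Python. With a single binary instrument and no covariates, the IV estimate reduces to the Wald ratio: the intent-to-treat difference in outcomes divided by the difference in treatment rates between the two assignment groups. The function name and the toy data below are hypothetical, and the paper's specification also includes controls, so this is only an illustration of the mechanism.

```python
def wald_estimate(outcome, treated, assigned):
    """Wald (ratio) IV estimate with a binary instrument and no covariates."""

    def mean(values):
        return sum(values) / len(values)

    y1 = [y for y, z in zip(outcome, assigned) if z == 1]
    y0 = [y for y, z in zip(outcome, assigned) if z == 0]
    d1 = [d for d, z in zip(treated, assigned) if z == 1]
    d0 = [d for d, z in zip(treated, assigned) if z == 0]

    itt = mean(y1) - mean(y0)          # effect of assignment on the outcome
    first_stage = mean(d1) - mean(d0)  # effect of assignment on being treated
    return itt / first_stage


# Hypothetical data: 3 of 4 assigned students use the lab, 1 of 4 controls does.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
treated = [1, 1, 1, 0, 0, 0, 0, 1]
outcome = [0.6, 0.6, 0.6, 0.6, 0.4, 0.4, 0.4, 0.4]
print(wald_estimate(outcome, treated, assigned))
```

Because contamination shrinks the first-stage difference below one, the ratio is larger in magnitude than the intent-to-treat difference, which is the direction of the adjustment reported in the text.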

²¹ We have used alternative definitions of students receiving treatment, such as whether the
student completed at least 5 lessons in the lab and whether the student completed at least 10 lessons
in the lab. The results were robust to these alternative definitions.

As noted above, although we have nearly 60 teachers who participated in the analysis, we
also sought to understand whether these impacts result because we, by chance, selected more
motivated teachers to teach in the lab. Thus, we exploit the fact that just over one-half of the
teachers taught both in and out of the computer lab and include teacher fixed effects in the
analysis. These results are presented in Table 4b which is otherwise identical in layout to Table
4a. The within-teacher coefficient estimates are uniformly greater than those without teacher
fixed effects. Thus, we estimate that, controlling for (time-invariant) teacher quality,
being assigned to a computer lab increases student math achievement. The intent-to-treat
effect is nearly 30% of a standard deviation; when we adjust for non-compliance using IV the
effect of CAI increases to 40% of a standard deviation. These effects are educationally large and
statistically significant and (translated) suggest that students who actually completed lessons in
the lab gained roughly 50 percent of a year more than those taught in a traditional classroom.

We next consider whether we detect similar effects of CAI on student math achievement
using other math test instruments. Because these instruments were not standardized across the
districts, we present the results separately by district. Table 6a shows the intent-to-treat effect of
CAI in which we use four outcomes in District 1. The first (column (1)) is the pre-algebra and
algebra test developed by NWEA that was also used as the outcome in Tables 4a and 4b; the
second and third are the second and third quarter benchmark tests conducted by the district
(columns (2) and (3)); and the final column (column (4)) is the statewide math test. We present
the results in two panels: the top panel uses the maximum available sample for each outcome
and the lower panel constrains the sample to be constant across them.

²² Part of the reason for the larger estimated coefficients in Table 4b derives from the fact that
the intent-to-treat effect of CAI is larger when we limit the sample to the subset of teachers who
taught both in and out of the lab (i.e., those observations from which the fixed effects analysis is
identified). When we conduct the analysis on this subsample of teachers and do not include teacher
fixed effects the intent-to-treat effect (similar to that in column (4) in Table 4a) is 0.27 and the IV
estimate (similar to that in column (6) in Table 4a) is 0.44.

In District 1, when we allow for the maximum possible sample, the intent-to-treat effect
using the NWEA pre-algebra/algebra test is approximately 0.23 of a standard deviation. We see
a larger gain of 0.4 of a standard deviation using the 2nd quarter benchmark test and a gain of 0.6
of a standard deviation using the 3rd quarter benchmark test. Importantly, we also detect an
effect of 0.26 on the state mathematics test. All of these gains are educationally large and
statistically significant at the 5% level. Further, the coefficient estimates in the bottom panel
suggest that the gains are not simply driven by changes in the sample size across the
specifications, as they are even larger.

Analogous results for Districts 2 and 3 are presented in Table 6b (note that benchmark 
tests were not administered in these districts). Columns (1) and (3) show the effect of CAI using 
the NWEA test; those in columns (2) and (4) report the effect using the statewide test for each of 
the districts. In District 2 we detect an effect of 0.2 of a standard deviation using the algebra test 
with a p-value of 0.13; the effect is much smaller on the state test - less than 10% of a standard 
deviation - and not statistically different from zero. That said, these results are not unexpected 
given that most of the state math test is not geared towards pre-algebra and algebra. Note that 
the results do not appear to depend on whether or not the sample is restricted to be the same in 
both specifications. In contrast, we estimate a negative intent-to-treat effect of CAI on student 
achievement in District 3 using both the NWEA test and the state math test, although neither
coefficient estimate is statistically different from zero (in fact the standard errors are much larger
than the coefficient estimates).

While the magnitude of the intent-to-treat effect is largest in District 1, the effect (based
on the algebra test) is not statistically distinguishable from that in District 2.²⁴ Further, we note
that the negative effect in District 3 is driven by the results from only one randomization pool. If
we exclude this pool from the analysis the point estimate in column (3) of the top panel of Table
6b rises to 24 percent of a standard deviation and that in column (4) rises to 15 percent of a
standard deviation. These estimates are not statistically different from those estimated in
Districts 1 and 2.²⁵ In addition, in the districts in which CAI appears most effective, it
improves student achievement on more than simply one math test.

C. Empirical Evidence on Why is CAI More Effective 

The discussion in Section II suggested that CAI may be more effective for some students
than others and in classes in which individualized instruction may be particularly advantageous.
In the following tables, we look for patterns of impacts that are consistent with this
interpretation.²⁶,²⁷ In Table 7 we estimate whether the effect of CAI is different for pre-algebra
versus algebra or students of different ability as measured by baseline (NWEA) test scores.²⁸
Each column of the table represents estimates of the effect of CAI for a different subset of the
analysis sample. We present estimates for the three districts combined (column (1)), Districts 1
and 2 combined (column (2)), and Districts 1, 2, and 3 separately in columns (3), (4), and (5),
respectively. The top panel estimates differential effects by pre-algebra and algebra and the
bottom panel estimates the CAI effect by student ability as measured by the baseline test score
quartile.²⁹

²³ We have also estimated IV models by district for all of the outcomes. In general the
coefficient estimates are larger but not qualitatively different from the OLS estimates. These results
are available on request.

²⁴ This inference is based on a combined regression in which we interact the intent-to-treat
effect with dummy variables indicating the school district.

²⁵ The subsequent results are qualitatively similar with or without this one randomization
pool in District 3. A complete set of results without the randomization pool is available on request.

The study includes students in algebra and pre-algebra classes in all three districts, with
roughly 23 percent in pre-algebra classes.³⁰ Pooling all three districts we estimate that the effect
of CAI for pre-algebra students is significantly larger than the effect for algebra students (the
p-value of the difference between the two effects equals 0.001). Pre-algebra students in CAI score
0.48 standard deviations higher than pre-algebra students in traditional classes while algebra
students in CAI score less than 1 percent of a standard deviation higher and the effect is not
statistically different from zero. Note, however, that the effect of CAI for algebra students is
being driven toward zero by the negative effect of CAI for algebra students in Districts 2 and 3.
That said, even in District 1 we find evidence that CAI has a larger effect among pre-algebra
students than algebra students. In District 1 we estimate that CAI pre-algebra students score
0.44 standard deviations higher than traditionally taught pre-algebra students while CAI algebra
students score only 0.13 standard deviations higher than traditionally taught algebra students.
For each district the p-value for the test that the pre-algebra effect of CAI equals the algebra
effect of CAI is less than 0.07.³¹ Thus, this CAI treatment appears more effective for pre-algebra
students than for algebra students.

²⁶ We have conducted all of the subsequent analysis using the statewide tests rather than the
NWEA pre-algebra and algebra test designed for this study. The biggest problem is that the sample
sizes are much smaller, generating results that are quite imprecise. However, many of them are
qualitatively similar to those presented in the paper. These results are available from the authors on
request.

²⁷ We have also tested whether the effectiveness of CAI differs by sex or race/ethnicity and
find no systematic differences. The results are available from the authors on request.

²⁸ Each column in each panel represents a separate regression.

²⁹ Test score quartiles for all specifications are defined within district and algebra level. All
specifications additionally control for student demographic characteristics as described above and
indicators for the randomization pool. The top panel also includes the baseline test score while the
bottom panel includes, instead, indicators for the baseline test score quartile. We also include main
effects for the level of math class in the top panel. We emphasize that these results are qualitatively
similar when we use growth-normed scores, suggesting that they are not an artifact of the test score
scaling and the possibility that students at different parts of the distribution would naturally have
differential gains over the course of the year.

³⁰ In the analysis sample, 30 percent of District 1 students are in pre-algebra, 12 percent of
District 2 students are in pre-algebra, and 9 percent of District 3 students are in pre-algebra.

In the bottom panel we allow the effect of CAI to differ by prior student math
achievement.³² A promised benefit of CAI is that the instruction is completely individualized in
the sense that students can move at their own pace in covering the material. In contrast, students
in a traditional classroom cover all lessons at the same pace. This could mean that CAI is
differentially effective for students of different math ability. For example, suppose traditional
classroom teachers always teach pre-algebra and algebra at the pace that is appropriate for the
highest ability students in the class. In this case, we might expect to see that high ability
students do equally well in CAI and traditional classrooms while those with lower math ability
do better in CAI because they can take more time to cover each lesson and therefore learn the
material better even if they do not cover as many lessons. Alternatively, if traditional classroom
teachers always teach pre-algebra and algebra at the pace that is appropriate for the lowest ability
students then high ability students may do better in CAI because they can cover more material
than covered in a traditional classroom. While a possibility, when we pool either all three
districts (column (1)) or Districts 1 and 2 (column (2)) we estimate that CAI is roughly equally
effective for students with the lowest and highest prior math achievement
(p-value > 0.60). Thus, we find no evidence that CAI is more or less effective for students with
stronger or weaker backgrounds in math as measured by the baseline algebra test.

³¹ Statistically, we can reject that the effectiveness of CAI for algebra students is the same
in District 2 or 3 as in District 1. The effectiveness of CAI for pre-algebra students in District 2 is very
similar to and not statistically different from that in District 1, and although the estimated CAI effect
for pre-algebra students in District 3 is larger than in District 1, we also cannot reject that it is the same
as in District 1.

³² In the bottom part of this table and in the subsequent tables we combine pre-algebra and
algebra students to increase our statistical power. The results are qualitatively similar if we limit
the sample to pre-algebra students. Such results are available on request.

Tables 8a and 8b test for different CAI effects by attendance characteristics of individual 
students and for the class based on attendance data from the prior academic year. As noted 
earlier, we only have data on student attendance for Districts 2 and 3. While the pooled data 
suggest that, indeed, CAI is more effective for students with worse attendance rates, we cannot
reject that there are no differences at standard levels of significance. We find some statistically
significant differences by attendance quartile using District 3 alone, but the pattern of results is
not fully consistent with the hypothesis that the individualized instruction of CAI mitigates the
negative effects of poor attendance rates.







Table 8b presents estimates allowing the effect of CAI to differ with the average
attendance rate of the students in the classroom.³³ For Districts 2 and 3 either pooled or
individually we find a larger CAI effect for classrooms with lower average attendance rates. For
students in a classroom with average attendance rates, the CAI effect is less than 6 percent of a
standard deviation and not statistically different from zero. In contrast, the CAI effect for
students in a classroom with attendance rates one standard deviation below the mean is 0.35 of a
standard deviation (p-value equals 0.08).

³³ For each student we calculate the average attendance rate of her classmates using
attendance data for the prior year and excluding her own attendance rate from the calculation.

Next, we examine whether CAI is more effective for larger classes. Here we measure
class size based on the initial class assignment rosters used for random assignment; thus, class
size is available for all three districts. The average class sizes in these districts range from 24 to
29 students. Pooling all three districts, we find that the CAI effect is larger for larger
classrooms; unfortunately this marginal effect is not statistically significant at standard levels
(p-value equals 0.19). However, pooling only Districts 1 and 2 we find that the CAI effect is about
twice as large and statistically significant at the 10% level (the p-value is 0.067). Based on this
estimate, for a classroom of 25 students the effect of CAI is 0.21 of a standard deviation (p-value
< 0.001). For a class of 15 students there is no difference between CAI and traditional
instruction (0.01 of a standard deviation with a p-value of 0.89). Class size effects are positive
for District 1 (p-value = 0.09) and District 2 (p-value = 0.80), individually. The coefficient
estimate is very small and negative with a large standard error in District 3. We cautiously
conclude there is some evidence CAI is more effective in larger classes, consistent with the idea
that the main benefit of CAI is the individualization of the instruction.
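
The two reported points for Districts 1 and 2 pooled (0.01 of a standard deviation at 15 students and 0.21 at 25 students) can be turned into a rough per-student gradient. The Python sketch below assumes the CAI effect varies linearly with class size between those two points; that linearity, the function name, and the intermediate class sizes evaluated are assumptions made for illustration, not results reported in the paper.

```python
def implied_cai_effect(class_size, small=(15, 0.01), large=(25, 0.21)):
    """Linearly interpolate the CAI effect (in SD units) between two reported points."""
    (n0, e0), (n1, e1) = small, large
    slope = (e1 - e0) / (n1 - n0)  # implied change in effect per additional student
    return e0 + slope * (class_size - n0)


# Evaluate at the two reported class sizes and two illustrative sizes in between.
for n in (15, 20, 24, 25):
    print(n, round(implied_cai_effect(n), 2))
```

Under this assumption the implied gradient is about 0.02 of a standard deviation per additional student, which is one way to read the statement that the benefit of CAI grows with class size.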

Finally, we examine whether CAI effects are larger in classrooms with greater 
heterogeneity in terms of baseline math achievement. Specifically, we allow the CAI effect to 
depend on the baseline test score standard deviation for the class. The top panel of Table 10 
presents overall results. While the estimate of the coefficient on the interaction term for District
1 is negative, those for Districts 2 and 3, individually, are positive, consistent with the idea that
the benefit of CAI is through individualized instruction. However, regardless of sample, none of
the coefficients on the interaction between CAI and baseline standard deviation are statistically 
significant. 

One potential explanation for the results only being weakly supportive of the importance
of individualized instruction is that heterogeneity, in-and-of itself, may not hinder effective
teaching. Rather, in certain circumstances - such as in small classrooms - heterogeneity in
student ability may be quite manageable in a traditional classroom. In this case, the relative
advantage of CAI (and hence more individualized instruction) may only become apparent in
large and heterogeneous classes. To test this hypothesis, in the second panel of Table 10 we add a
third-level interaction - that between CAI, the baseline standard deviation in student test scores,
and an indicator for whether the class is “large” (defined as more than 24 students). We now
find there is a large, statistically significant, relative advantage to being assigned to CAI for
large, heterogeneous classes, which is consistent with the hypothesis that CAI benefits primarily
accrue through increased individualization of instruction.

³⁴ The results are robust to small changes in the definition of a large class. For example, the
result is similar if we define large classes as those with more than 20 students (the 30th percentile
based on classrooms), but they are not similar at the 60th percentile (more than 26 students). We also
obtain qualitatively similar results when we define class size as a continuous variable.

V. Cost-Benefit Simulation

Of course, gains from computerized instruction do not come for free as the computer labs
required for CAI are costly and are dedicated to CAI. In our example, a 30-seat lab costs
$100,000 with an additional $150,000 for pre-algebra, algebra, and classroom management
software and roughly $17,000 per year for training, support, and maintenance of the lab.³⁵
According to the company’s website a lab lasts 7-10 years so a CAI lab may cost nearly $53,000
per year.³⁶
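
The annual cost figure can be checked with simple arithmetic. The Python sketch below spreads the up-front hardware and software cost evenly over the life of the lab and adds the yearly training, support, and maintenance cost. Straight-line spreading with no discounting is an assumption, as are the function names; a 7-year life appears consistent with the "nearly $53,000" figure. The second function reconstructs, again as an assumption, the per-pupil figure that the accompanying footnote attributes to the company (400 students per year for 7 years, with support costs paid only for the first three years).

```python
def annual_lab_cost(hardware=100_000, software=150_000,
                    upkeep_per_year=17_000, lifetime_years=7):
    """Annualized lab cost: up-front costs spread evenly, plus yearly upkeep.

    Assumption: straight-line spreading with no discounting.
    """
    return (hardware + software) / lifetime_years + upkeep_per_year


def company_style_cost_per_pupil(pupils_per_year=400, lifetime_years=7,
                                 upkeep_years=3):
    """Per-pupil cost under the assumptions the footnote attributes to the company."""
    total = 100_000 + 150_000 + 17_000 * upkeep_years
    return total / (pupils_per_year * lifetime_years)


print(round(annual_lab_cost(lifetime_years=7)))   # 52714
print(round(annual_lab_cost(lifetime_years=10)))  # 42000
print(company_style_cost_per_pupil())             # 107.5
```

The 10-year case shows how sensitive the annual figure is to the assumed life of the lab.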

Given that providing instruction through CAI may serve as a substitute for reduced class
sizes, one way to evaluate its cost effectiveness is to compare its cost to the compensation cost of
hiring additional teachers to reduce class size. Using pre-algebra/algebra test scores measured in
national standard deviation units we find that a student in an average-sized class (24 pupils)
using CAI in our largest district (District 1) scores 11 percent of a standard deviation higher than
a student in a similarly-sized traditional classroom. Because the gains from CAI are larger for
larger classes, the benefit of CAI equals zero when the average class size is reduced to 13
students. Thus we compare the per-pupil cost of CAI to the cost of reducing class sizes to 13
students.

³⁵ Information on the cost of a CAI lab comes from one of the districts in our study.

³⁶ The company estimates the annual cost per pupil at just over $100. However, we can only
get close to this per-pupil estimate if we assume that the lab would serve 400 students per year over
a 7 year period and that the district would not pay for training, support, and maintenance costs after
the initial three years. We generate our own estimates because we believe this cost per pupil to be
unrealistically low.

We begin with an estimate of the cost of reducing class size using all of the schools in 
District 1 that are in our analysis sample. The average class size for all District 1 classes 
represented in the study is 23.5. Although District 1 has eight periods per day, by contract 
teachers do not teach every period. The typical teacher in our sample teaches 6 periods. As a 
result, the District would have to hire about 24 more pre-algebra and algebra teachers to reduce 
the average class size to 13. Using an estimate of the starting salary for teachers in District 1, 
adjusted to reflect “total compensation,” we estimate that the cost of class size reduction would 
be $241 per pupil per year. (See the Simulation Appendix and Appendix Table 3 for details.) 

The key determinants of whether CAI is more cost effective than class size reduction are 
the average number of students per class in the lab and the number of periods in the day a lab 
can be used. If the district implements CAI and keeps the average class size in the lab at 23.5 
students, the annual per pupil cost is about $279. Per pupil costs of CAI are lowest when the lab 
can be used every period of the day and each class has 30 pupils in it. If 30 students were 
assigned to classes in the lab, the per pupil cost decreases to about $218, which is slightly lower 
than the estimated cost of class size reduction. More generally, the per pupil cost of CAI is 
estimated to be less than or equal to the cost of class size reduction as long as the district 
increases the average class size in the lab to between 27 and 30 pupils. 

We only report estimates using the analysis sample in District 1 because we have a good 
understanding of the typical number of periods in each school; we must make more assumptions 
when we use our entire analysis sample. That said, the estimated annual cost per pupil of CAI 
would be about $274 using the entire analysis sample and the estimated cost of reducing class size 
to 13 students would be about $246. 

The cost of reducing class size in this simulation is much lower than the estimates of the 
cost of class size reduction for elementary schools as in Tennessee STAR (e.g., nearly $5000 per 
pupil in Schanzenbach 2006). This is primarily because when class sizes are reduced at the 
elementary school level, it is for all subjects, not just algebra and pre-algebra. 
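
The per-pupil CAI figures and the 27-to-30 pupil threshold can be checked with a short calculation. The sketch below assumes that each pupil occupies one lab seat for one period per day, so that a lab serves (class size x periods) pupils, and it takes the $52,381 annual cost per lab from Appendix Table 3; the function names are illustrative:

```python
ANNUAL_LAB_COST = 52_381          # annual cost per lab reported in Appendix Table 3
PERIODS_PER_DAY = 8               # assumes the lab is in use every period
CLASS_SIZE_REDUCTION_COST = 241   # per-pupil cost of cutting classes to 13 (from the text)

def cai_cost_per_pupil(class_size, periods=PERIODS_PER_DAY, lab_cost=ANNUAL_LAB_COST):
    """Annual lab cost divided by the number of pupils one lab serves per day."""
    return lab_cost / (class_size * periods)

def break_even_class_size(target_cost, periods=PERIODS_PER_DAY, lab_cost=ANNUAL_LAB_COST):
    """Lab class size at which CAI costs exactly target_cost per pupil."""
    return lab_cost / (periods * target_cost)

cost_at_observed = cai_cost_per_pupil(23.5)
cost_at_full_lab = cai_cost_per_pupil(30)
threshold = break_even_class_size(CLASS_SIZE_REDUCTION_COST)

print(round(cost_at_observed))   # compare with "about $279"
print(round(cost_at_full_lab))   # compare with "about $218"
print(round(threshold, 1))       # class size above which CAI is the cheaper option
```

Because the lab cost is fixed, the per-pupil cost falls in inverse proportion to the number of pupils served, which is why class size in the lab and periods of use drive the comparison.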

For individual schools in District 1 with larger average class sizes, our estimates of the 
cost of implementing CAI are less than our compensation cost estimates of reducing class size, 
even without increasing the average class size in the lab. For example, School B has an average 
class size of 26.8. In this case, cutting the average class size in half costs roughly $278 per pupil 
compared to $245 per pupil to implement CAI without changing the average class size. The 
benefits of CAI are the most attractive in School A where the cost of reducing class sizes is over 
$100 more per student than that of adopting CAI. 

In general our calculations suggest that the costs of reducing pre-algebra and algebra 
classes to 13 students and adopting CAI are quite comparable. However, we suspect that our 
estimates of the cost of class size reduction are more severely underestimated compared to those 
for CAI. The reason is that they only reflect increased costs in terms of teacher compensation 
while, in fact, there would likely be additional costs such as recruiting costs and capital 
expenditures that have not been taken into account. As a result, CAI may be the more cost- 
effective way for school districts to raise mathematics achievement. Furthermore, in urban and 
rural districts that have difficulty hiring highly qualified mathematics teachers, CAI may be 
much easier to implement than a drastic reduction in class size. 



VI. Conclusion 
Our results suggest that CAI may increase student achievement in pre-algebra and 
algebra by at least 0.17 of a standard deviation, on average, with somewhat larger effects for 
students in larger classes. Put differently, students learning pre-algebra and algebra through CAI 
are 26% of a school year ahead of their classmates in traditional classrooms after one year. In 
interpreting these results, one must keep in mind that the outcomes were measured relatively 
soon after the intervention ended such that we do not know how long they would “last.” At the 
same time, it is not clear to us how one might measure such longer run outcomes, particularly 
since mathematics is not necessarily cumulative at the secondary school level, students in the 
control group may go on to use CAI, and all of the students may have been involved in other 
enrichment programs. In addition, this represents only one use of computers for teaching 
pre-algebra and algebra and not all CAI hardware and software may be equally effective. That said, 
this study suggests that CAI has the potential to significantly enhance student mathematics 
achievement in middle and high school, that the gains are comparable to those achieved with 
drastic class size reduction, and that the costs are likely somewhat lower than the full cost of 
reducing the average class size for all algebra and pre-algebra classes. At the very least, our 
results suggest that CAI deserves additional rigorous evaluation and policy attention, particularly 
since it may be much easier for schools and districts to implement than large scale class size 
reduction. 
Simulation Appendix 

In this appendix we present more detailed information on the cost calculations for CAI 
and class size reduction using information on all algebra and pre-algebra classes for two schools 
in District 1. We also present the same calculations for all District 1 algebra and pre-algebra 
classes in the analysis sample. Thus the top panel of the table presents cost estimates for 
implementing CAI while the bottom panel presents cost estimates for reducing class size to 13 
students. The cost estimates vary because of differences across the schools in the average class 
size. 

The first three columns are identical in each panel and represent the total number of pre- 
algebra and algebra classes, total number of students, and the average class size, respectively. 
Column (4) lists the number of periods the lab is in use (top panel) or the teacher is teaching 
(bottom panel). For CAI we assume that the average class size is equal to the observed average 
class size or a maximum of 30 students (column (5) in the top panel). For class size reduction, 
we assume that classes are reduced to 13 students. Column (5) in the bottom panel equals the 
total number of new classes required to generate an average class size of 13. Column (6) then 
presents the number of labs the school (district) needs to put all algebra and pre-algebra classes 
in CAI (top panel) or the number of additional teachers needed to reduce algebra and pre-algebra 
class size to 13 given the assumption that the new teachers teach for 6 of the 8 periods in the day. 
Finally, we assume the lab involves a fixed cost of $250,000 for hardware and software and 
$50,000 for 3 years of support, training, and maintenance and that the lab is good for 7 years. For 
the compensation cost of each teacher we use the salary of a new teacher in District 1 with zero 
years of experience and further assume that salary is 70 percent of the total compensation cost. 

As noted in the text, we only present results using the analysis sample in District 1 because 
we have specifics about the structure of the school day. To use the entire analysis sample we must 
make more assumptions. 

For a large school in our sample (School A), the cost of CAI is $218 per pupil compared 
to $329 per pupil to reduce class size to 13 students. For a smaller school in our sample (School 
B), the cost per pupil is roughly $245 for CAI compared to $278 for class size reduction. The 
final row in each panel presents cost estimates using information for all algebra and pre-algebra 
classes in District 1 that are represented in the analysis sample. In this case, our per pupil cost 
of CAI is nearly $280 compared to a per pupil cost of reducing class size that is closer to $240. 
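
The CAI rows of Appendix Table 3 can be approximately reproduced from the class and student counts. This is a sketch under one reading of the cost assumptions: that support continues at the same yearly rate ($50,000 per 3 years) for the full 7-year life of the lab, which matches the $52,381 annual cost per lab shown in the table. The function and variable names are illustrative.

```python
HARDWARE_AND_SOFTWARE = 250_000
SUPPORT_PER_3_YEARS = 50_000
LAB_LIFE_YEARS = 7
PERIODS = 8
MAX_LAB_CLASS_SIZE = 30

# Assumption: support is paid at the same yearly rate for the whole lab life.
annual_lab_cost = HARDWARE_AND_SOFTWARE / LAB_LIFE_YEARS + SUPPORT_PER_3_YEARS / 3

def cai_row(n_classes, n_students):
    """Return (lab class size, labs needed, cost per student) for one school."""
    class_size = n_students / n_classes
    lab_class_size = min(class_size, MAX_LAB_CLASS_SIZE)
    labs_needed = n_students / lab_class_size / PERIODS
    cost_per_student = labs_needed * annual_lab_cost / n_students
    return lab_class_size, labs_needed, cost_per_student

# Class and student counts from Appendix Table 3.
schools = {
    "School A": (22, 730),
    "School B": (12, 321),
    "District 1 analysis sample": (74, 1736),
}
rows = {name: cai_row(*counts) for name, counts in schools.items()}

for name, (size, labs, cost) in rows.items():
    print(f"{name}: lab class size {size:.1f}, labs {labs:.2f}, cost per student ${cost:.0f}")
```

The table reports labs needed to one decimal place, so small rounding differences from the printed values are expected.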

When we consider the analysis sample for all three districts, we assume that teachers 
typically teach 6 out of a total of 8 class periods during the day in all three districts and that 
teacher salaries are the same as in District 1. Thus, since the average class size for all classes in 
the analysis sample (23.9) is quite similar to the average for District 1 classes (23.5), the 
estimates of the cost of CAI and the cost of class size reduction are quite similar to the estimates 
for District 1, $274 per pupil for CAI and $246 per pupil for class size reduction. This is likely 
an overestimate for CAI and an underestimate for class size reduction. For some of the schools 
in Districts 2 and 3, it appears that teachers may actually teach fewer than 6 classes per day, and 
some schools may actually have more than 8 possible periods during the day. Also, teacher 
salaries may be somewhat higher in District 2 than in Districts 1 and 3. 



Most schools in District 1 operate on a block schedule; however, classes could be 
organized either in 4 blocks for 1 semester or 8 periods over 1 year. For simplicity we assume 
classes are organized into 8 periods over 1 year for all schools. 
Table 1: Districts in Study Compared to National Average 

                                United States
                                100 Largest     3 Districts
                                Districts       Combined        District 1      District 2      District 3
Average # of students in a
  district (all grades)         112,807         ~63,000         ~68,000         ~22,000         ~97,000
% Female                        48.8            49.4            49.7            48.8            49.3
% African American              28.1            69.5            93.6            40.3            59.4
% Hispanic                      34.1            16.2            1.1             54.3            18.0
% Native American               0.6             0.5             0.1             0.1             0.9
% Asian                         7.1             3.1             1.9             0.8             4.4



Source: Authors’ calculations based on the National Center for Education Statistics Common Core of Data, 2003-2004 school year, 
100 largest districts by total enrollment. Percentages are based only on schools reporting. (Data on sex are missing for Knox County, 
Memphis City, Nashville-Davidson County, Philadelphia City, Portland, and Shelby County School Districts. Data on race and 
ethnicity are missing for Memphis City, Nashville-Davidson County, and Shelby County School Districts.) Demographic 
characteristics for the 3 districts combined are enrollment-weighted averages of the individual district means. 







Table 2: Schools and Students in Study Compared to the Overall District Averages 

                        District 1                    District 2                    District 3
                        Relevant  Schools   Students  Relevant  Schools   Students  Relevant  Schools   Students
                        Schools   in Study  in Study  Schools   in Study  in Study  Schools   in Study  in Study
number of students      29,603    8,148     973       5,270     4,476     412       27,572    3,540     200
students per school     604       815       97        659       1119      103       484       1180      67
% grade 8               19.3      16.8      40.4      2.3       0.0       0.0       1.4       0.0       3.5
% grade 9               18.0      18.3      47.2      38.0      40.0      52.7      35.6      40.0      91.5
% grade 10              15.1      17.8      9.9       22.0      23.2      31.8      23.3      25.1      3.0
% female                50.5      49.0      52.0      48.4      48.2      46.7      49.9      47.6      47.7
% African American      94.2      97.2      87.8      43.6      42.0      47.1      61.1      92.5      94.5
% Hispanic              1.0       0.8       0.8       50.1      51.2      44.7      15.2      1.2       0.5
% white                 2.6       0.4       0.1       5.5       5.9       6.6       18.3      4.0       1.5
% Native American       <0.1      <0.1      0.0       0.2       0.1       0.2       1.1       0.4       0.0
% Asian                 2.2       1.6       1.8       0.7       0.8       0.5       4.5       1.9       3.0
% missing demographic
  data                                      9.6                           0.2                           0.5



Source: Authors’ calculations based on the National Center for Education Statistics. Common Core of Data, 2003-2004 school year. 
There are 49 “relevant” schools in District 1, 8 in District 2, and 57 in District 3. Relevant schools in District 1 are defined as schools 
in the CCD with a level of middle school, high school, or other; relevant schools in District 2 
and District 3 have a level of high school or other. We drop middle schools in District 1 for 
which the highest grade offered is less than grade 8. There are 10 schools in the study in District 
1, 4 schools in District 2, and 3 schools in District 3. Characteristics on the students in the study 
come from data made available to the authors by the school districts. 
Table 3: Randomization of Treatment and Control Using Full and Analysis Samples 

                                Random Assignment
                                Traditional     Computer-Assisted   p-value of
                                Instruction     Instruction         difference
Full Sample
  Baseline algebra test score   24.7            24.7                0.494
  Percent female                47.2            47.1                0.637
  Percent African American      80.0            83.2                0.561
  Percent Hispanic              15.9            13.5                0.195
  Class size                    25.8            25.7                0.860
  Number of Observations        1133            1145

Analysis Sample
  Baseline algebra test score   24.7            24.8                0.304
  Percent female                51.1            48.9                0.148
  Percent African American      81.9            84.0                0.060
  Percent Hispanic              13.8            12.1                0.061
  Class size                    25.8            26.2                0.549
  Number of Observations        785             800





Notes: All test scores are scaled scores converted to standard deviation units. The test for a 
difference in mean characteristic by random assignment is based on a regression of the 
characteristic on an indicator for random assignment and randomization pool fixed effects 
allowing for correlation in standard errors at the classroom level. We report the p-value for the t- 
test that the coefficient on the random assignment indicator equals zero. 









Table 4a: Ordinary Least Squares and Instrumental Variable Estimates 
of the Effect of Computer-Assisted Instruction (CAI) on Algebra Achievement 
(without Teacher Fixed Effects) 

                              OLS                                               IV
                              (1)       (2)       (3)       (4)       (5)       (6)
CAI                           0.173     0.172     0.212     0.172     0.173     0.249
                              (0.076)   (0.074)   (0.077)   (0.060)   (0.059)   (0.086)
Baseline algebra test score                                 0.500     0.493     0.491
                                                            (0.035)   (0.034)   (0.034)
Female                                  0.081                         0.095     0.087
                                        (0.044)                       (0.041)   (0.041)
African American                        -0.671                        -0.506    -0.498
                                        (0.180)                       (0.137)   (0.138)
Hispanic                                -0.540                        -0.390    -0.370
                                        (0.211)                       (0.159)   (0.159)
Observations                  1872      1872      1585      1585      1585      1585



Notes: Each column represents a separate regression. Test scores are scaled scores converted to 
standard deviation units. Each regression also controls for the randomization pool as well as an 
indicator equal to one if sex is missing and an indicator equal to 1 if race/ethnicity is missing for 
those regressions that include demographic information. For the IV estimates of the effect of 
treatment on the treated we define treatment as completing at least one lesson in computerized 
algebra instruction. We report standard errors that allow for correlation within classroom in 
parentheses. 
Table 4b: Ordinary Least Squares and Instrumental Variable Estimates 
of the Effect of Computer-Assisted Instruction (CAI) on Algebra Achievement 
(with Teacher Fixed Effects) 

                              OLS                                               IV
                              (1)       (2)       (3)       (4)       (5)       (6)
CAI                           0.373     0.367     0.423     0.284     0.283     0.417
                              (0.071)   (0.067)   (0.074)   (0.053)   (0.053)   (0.080)
Baseline algebra test score                                 0.483     0.477     0.468
                                                            (0.035)   (0.034)   (0.034)
Female                                  0.108                         0.125     0.115
                                        (0.041)                       (0.041)   (0.041)
African American                        -0.619                        -0.449    -0.433
                                        (0.155)                       (0.129)   (0.131)
Hispanic                                -0.498                        -0.351    -0.315
                                        (0.185)                       (0.154)   (0.152)
Observations                  1872      1872      1585      1585      1585      1585



Notes: Each column represents a separate regression. Test scores are scaled scores converted to 
standard deviation units. Each regression also controls for the randomization pool, an indicator 
equal to one if sex is missing, and an indicator equal to 1 if race/ethnicity is missing for those 
regressions that include demographic information, and teacher fixed effects. For the IV estimates 
of the effect of treatment on the treated we define treatment as completing at least one lesson in 
computerized algebra instruction. We report standard errors that allow for correlation within 
classroom in the parentheses. The p-values of the F-tests on the statistical significance of the 
teacher effects equal zero for all specifications. 
Table 5: Amount of Time in the Computer Lab 
by the Random Assignment of the Student’s Class 

                                            Random Assignment
                                            Traditional
                                            Instruction   CAI
Number of lessons students are expected     52.7          55.3
  to complete based on the course level     (14.4)        (15.3)
Percent of students completing no lessons   80.1          9.1
  in CAI                                    (39.9)        (28.8)
Percent of students completing more than    14.8          83.8
  10 lessons in CAI                         (35.5)        (36.9)
Percent of students completing more than    10.3          70.3
  20 lessons in CAI                         (30.4)        (45.7)
Number of lessons completed in CAI          5.6           33.0
                                            (15.2)        (23.9)
Number of CAI lessons completed as a        10.0          64.5
  percent of course expectations            (27.8)        (50.4)
Number of observations                      785           800



Notes: District 1 has 62 school days in the study while classes in districts 2 and 3 generally have 
180 days in the study. One exception is that a few classes in district 3 meet only one-half of the 
school days. 
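
The take-up figures in Table 5 give a rough sense of how the intent-to-treat and IV estimates in Table 4a relate. With a binary instrument and no covariates, the IV estimate reduces to the Wald ratio: the intent-to-treat effect divided by the difference in treatment take-up between the two arms. The sketch below applies that ratio to the rounded figures in Tables 4a and 5. It is only an approximation, since the reported IV regressions also include controls and randomization pool fixed effects, and the variable names are illustrative.

```python
# Share of students completing at least one CAI lesson, from Table 5
# (100 minus the percent completing no lessons in CAI).
takeup_cai = (100 - 9.1) / 100            # classes assigned to CAI
takeup_traditional = (100 - 80.1) / 100   # classes assigned to traditional instruction

itt_effect = 0.173  # Table 4a, column (5): effect of assignment to CAI

# First stage: how much random assignment shifts actual use of CAI.
first_stage = takeup_cai - takeup_traditional

# Wald ratio: scale the assignment effect up by the first stage.
wald_estimate = itt_effect / first_stage

print(f"first stage: {first_stage:.3f}")
print(f"Wald estimate: {wald_estimate:.3f}")  # compare with 0.249 in Table 4a, column (6)
```

Because some students in traditional classes used CAI and some assigned to CAI completed no lessons, the first stage is below one and the effect on the treated exceeds the effect of assignment.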
Table 6a: Ordinary Least Squares Estimates of the Effect of 
Computer-Assisted Instruction (CAI) on Algebra and Mathematics Achievement 
in District 1 Using Different Tests 

                                             2nd Qtr        3rd Qtr        State
                              Algebra        Benchmark      Benchmark      Mathematics
                              Scale Score    Algebra Test   Algebra Test   Test
                              (1)            (2)            (3)            (4)
Maximum available sample
  CAI                         0.226          0.381          0.604          0.260
                              (0.071)        (0.127)        (0.286)        (0.119)
  Observations                973            230            239            454

Constraining sample students to be the same across specification
  CAI                         0.374          0.462          0.946          0.381
                              (0.168)        (0.173)        (0.482)        (0.139)
  Observations                185            185            185            185



Notes: Standard errors that allow for correlation within classroom are in parentheses. The 
dependent variable in the first column is the normalized scale score for the algebra test; that in 
the second column is the 2nd quarter district-wide 8th-grade math test score; that in the third 
column is the 3rd quarter district-wide 8th-grade math test score; and that in the fourth column is 
the state mathematics test. All test scores are scale scores converted to standard deviation units. 
Each regression also includes controls for baseline test scores, the randomization pool, 
demographic characteristics, and an indicator equal to one if sex is missing, and an indicator 
equal to 1 if race/ethnicity is missing. The algebra and state mathematics tests were administered 
in the spring. The baseline algebra tests were given in the beginning of the academic year. The 
baseline benchmark algebra test was given in the 1st quarter of the academic year. The baseline 
state test was given in the spring of the preceding academic year. 
Table 6b: Ordinary Least Squares Estimates of the Effect of 
Computer-Assisted Instruction (CAI) on Algebra and Mathematics Achievement 
in Districts 2 and 3 Using Different Tests 

                              District 2                    District 3
                                             State                         State
                              Algebra        Mathematics    Algebra        Mathematics
                              Scale Score    Test           Scale Score    Test
                              (1)            (2)            (3)            (4)
Maximum sample available
  CAI                         0.200          0.089          -0.124         -0.062
                              (0.130)        (0.094)        (0.122)        (0.118)
  Observations                412            341            200            199

Constraining sample students to be the same across specification within school district
  CAI                         0.400          0.082          0.031          -0.202
                              (0.171)        (0.112)        (0.182)        (0.109)
  Observations                229            229            107            107



Notes: Standard errors that allow for correlation within classroom are in parentheses. The 
dependent variable in the first and third columns is the normalized scale score for the algebra 
test; those in the second and fourth column results are the respective state mathematics test. All 
test scores are scale scores converted to standard deviation units. Each regression also includes 
controls for baseline test scores, the randomization pool, demographic characteristics, and an 
indicator equal to one if sex is missing, and an indicator equal to 1 if race/ethnicity is missing. 
The algebra tests were administered in the spring. The baseline algebra tests were given in the 
beginning of the fall. For district 2 the state mathematics test was administered in the spring of 
the students’ 10th grade year. For district 3 the state mathematics test was administered in the fall 
of the students’ 10th grade year. For both districts, the baseline state tests were given in the fall of 
the students’ 8th grade year. 




Table 7: Differential Intent to Treat Effects of the Computerized Instruction on Pre-Algebra and Algebra Achievement 
by Class Type and Baseline Test Score Quartile 

                                      All 3       Districts
                                      Districts   1 and 2     District 1  District 2  District 3
                                      (1)         (2)         (3)         (4)         (5)
CAI effect for Algebra                0.005       0.069       0.130       -0.307      -0.230
                                      (0.059)     (0.065)     (0.066)     (0.218)     (0.100)
CAI effect for pre-Algebra            0.481       0.453       0.442       0.513       1.360
                                      (0.119)     (0.120)     (0.155)     (0.187)     (0.690)

CAI effect for bottom baseline test   0.216       0.288       0.280       0.136       -0.199
  score quartile                      (0.091)     (0.095)     (0.095)     (0.235)     (0.287)
CAI effect for 2nd baseline test      0.242       0.273       0.343       0.150       -0.090
  score quartile                      (0.100)     (0.104)     (0.115)     (0.212)     (0.282)
CAI effect for 3rd baseline test      0.171       0.199       0.090       0.522       0.004
  score quartile                      (0.105)     (0.117)     (0.125)     (0.260)     (0.161)
CAI effect for top quartile           0.155       0.245       0.218       0.358       -0.436
                                      (0.106)     (0.112)     (0.124)     (0.237)     (0.259)
Number of observations                1585        1385        973         412         200



Notes: Each column of each panel represents a separate regression. All test scores are scale scores converted to standard deviation 
units. Regressions in the top panel also include baseline test scores. Each regression also controls for the randomization pool, 
demographic characteristics, an indicator equal to one if sex is missing, and an indicator equal to 1 if race/ethnicity is missing. 
Baseline test score quartiles are defined within district and class type (algebra or pre-algebra). We report standard errors that allow for 
correlation within classroom in the parentheses. 
Table 8a: Differential Intent to Treat Effects of the Computerized Instruction 
on Pre-Algebra and Algebra Achievement 
by Individual Attendance Rates 

                                  Districts
                                  2 and 3       District 2    District 3
                                  (1)           (2)           (3)
CAI effect for bottom             0.439         0.112         0.797
  baseline attendance quartile    (0.287)       (0.395)       (0.353)
CAI effect for 2nd baseline       -0.221        -0.136        -0.578
  attendance quartile             (0.208)       (0.328)       (0.267)
CAI effect for 3rd baseline       -0.051        -0.053        -0.068
  attendance quartile             (0.175)       (0.256)       (0.293)
CAI effect for top baseline       -0.020        -0.119        0.146
  attendance quartile             (0.197)       (0.313)       (0.255)
Number of observations            372           221           151



Notes: Each column and panel represents a separate regression. Test scores are scaled scores converted to standard deviation units. 
Each regression also controls for the randomization pool, the baseline test scores, demographic characteristics, an indicator equal to 
one if sex is missing, and an indicator equal to 1 if race/ethnicity is missing. We report standard errors that allow for correlation 
within classroom in the parentheses. Each student’s attendance rate is calculated as the percent of enrolled days that the student is in 
attendance. Attendance quartiles are calculated within district. 




Table 8b: Differential Intent to Treat Effects of the Computerized Instruction 
on Pre-Algebra and Algebra Achievement 
by Class Characteristic: Attendance 

                                  District 2 and
                                  District 3      District 2    District 3
CAI                               2.131           2.261         2.808
                                  (1.017)         (1.220)       (1.970)
CAI x Average class               -0.025          -0.025        -0.034
  attendance                      (0.012)         (0.014)       (0.022)
Mean (std. deviation) of          83.287          82.513        84.695
  class attendance rate           (11.803)        (13.605)      (7.307)
Number of observations            564             364           200



Notes: See notes for Table 8a. Average class attendance is based on individual student attendance 
data for the year preceding the year of the experiment. 
Table 9: Differential Intent to Treat Effects of the Computerized Instruction 
on Pre-Algebra and Algebra Achievement by Class Size 

                                      All 3       Districts
                                      Districts   1 and 2     District 1  District 2  District 3
                                      (1)         (2)         (3)         (4)         (5)
CAI                                   -0.097      -0.281      -0.266      -0.035      -0.033
                                      (0.215)     (0.254)     (0.250)     (0.920)     (0.694)
CAI x Class size                      0.010       0.020       0.019       0.011       -0.004
                                      (0.008)     (0.011)     (0.011)     (0.042)     (0.022)
Mean class size                       26.005      25.623      26.420      23.740      28.650
  (standard deviation)                (6.623)     (6.122)     (6.330)     (5.135)     (8.976)
Number of observations                1585        1385        973         412         200



Notes: Each column represents a separate regression. Test scores are scaled scores converted to standard deviation units. Each 
regression also controls for the randomization pool, the baseline test scores, demographic characteristics, an indicator equal to one if 
sex is missing, and an indicator equal to 1 if race/ethnicity is missing. We report standard errors that allow for correlation within 
classroom in the parentheses. 
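
Because this specification interacts CAI with class size, the implied effect of CAI for a class of a given size is the CAI coefficient plus the interaction coefficient times class size. A minimal sketch using the rounded column (1) estimates; the function name and the example class sizes of 20 and 30 are illustrative, and the calculation ignores sampling error:

```python
def implied_cai_effect(cai_coef, interaction_coef, class_size):
    """CAI effect (in standard deviation units) implied for a given class size."""
    return cai_coef + interaction_coef * class_size

# Column (1), all three districts: CAI = -0.097, CAI x Class size = 0.010.
CAI_COEF = -0.097
INTERACTION_COEF = 0.010

at_mean = implied_cai_effect(CAI_COEF, INTERACTION_COEF, 26.005)  # mean class size
at_small = implied_cai_effect(CAI_COEF, INTERACTION_COEF, 20)     # hypothetical smaller class
at_large = implied_cai_effect(CAI_COEF, INTERACTION_COEF, 30)     # hypothetical larger class

print(f"effect at mean class size: {at_mean:.3f}")
print(f"effect at 20 pupils: {at_small:.3f}")
print(f"effect at 30 pupils: {at_large:.3f}")
```

The positive interaction term is what makes the implied effect grow with class size; given the standard errors in the table, the individual point values should be read loosely.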
Table 10: Differential Intent to Treat Effects of the Computerized Instruction 
on Pre-Algebra and Algebra Achievement by Class Baseline Test Score Standard Deviation 

                                      All 3       Districts
                                      Districts   1 and 2     District 1  District 2  District 3
                                      (1)         (2)         (3)         (4)         (5)
CAI x baseline standard deviation     0.110       -0.064      -0.118      0.600       0.583
  for the class                       (0.391)     (0.387)     (0.443)     (0.921)     (0.934)

                                      (6)         (7)         (8)         (9)         (10)
CAI x baseline standard deviation     -1.100      -0.620      -0.559      -0.514      -3.804
  for the class                       (0.560)     (0.529)     (0.594)     (1.352)     (0.369)
CAI x class baseline standard         1.512       0.688       0.485       9.257       4.136
  deviation x I(large class)          (0.892)     (0.870)     (0.907)     (2.085)     (0.711)

Mean class baseline standard          0.781       0.782       0.774       0.802       0.773
  deviation (standard deviation)      (0.160)     (0.154)     (0.157)     (0.147)     (0.193)
Number of observations                1585        1385        973         412         200



Notes: See notes for Table 9. The coefficients in top and bottom panels are from different specifications. The median class size in the 
overall sample is 24 students. A large class is defined as having more than 24 students. A small class is defined as having 24 or fewer 
students. 
Appendix Table 1: Numbers of Schools, Classes, Teachers, and Randomization Pools 

                                  Combined    District 1  District 2  District 3
Full Sample
  Number of schools               17          10          4           3
  Number of randomization pools   60          31          19          10
  Number of classes               151         81          46          24
  Number of teachers              61          39          15          7
  Number of students              3541        1870        1062        609

Analysis Sample
  Number of schools               17          10          4           3
  Number of randomization pools   60          31          19          10
  Number of classes               141         74          44          23
  Number of teachers              57          36          14          7
  Number of students              1585        973         412         200
Appendix Table 2a: Randomization of Treatment and Control (Using Full Sample) 

                                Random Assignment
                                Traditional     Computerized    p-value of
                                Instruction     Instruction     difference
District #1
  Baseline algebra test score   24.6            24.7            0.285
  Baseline state test score     9.2             9.2             0.990
  Baseline district test score  3.0             3.7             0.107
  Female                        51.5            47.8            0.128
  African American              98.0            97.8            0.260
  Hispanic                      0.6             0.8             0.821
  Class size                    25.3            25.5            0.949

District #2
  Baseline algebra test score   24.6            24.7            0.823
  Baseline state test score     6.6             6.7             0.558
  Female                        43.9            44.8            0.561
  African American              51.3            44.8            0.566
  Hispanic                      42.6            48.1            0.204
  Class size                    24.1            24.6            0.369

District #3
  Baseline algebra test score   25.0            24.9            0.904
  Baseline state test score     16.7            16.7            0.992
  Female                        43.2            48.2            0.482
  African American              92.7            95.6            0.126
  Hispanic                      0.7             0.8             0.792
  Class size                    30.4            28.0            0.547



Notes: All test scores are scaled scores converted to standard deviation units. The test for a 
difference in mean characteristic by random assignment is based on a regression of the 
characteristic on an indicator for random assignment and randomization pool fixed effects 
allowing for correlation in standard errors at the classroom level. We report the p-value for the t- 
test that the coefficient on the random assignment indicator equals zero. For district #1: baseline 
algebra test scores are available for 700 treatment students and 624 controls; baseline state test 
scores are available for 474 treatment students and 387 controls; baseline district test scores are 
available for 110 treatment students and 147 controls; and demographic data are available for 
831 treatment students and 689 controls. For district #2: baseline algebra test scores are available 
for 280 treatment students and 351 controls; baseline state test scores are available for 243 
treatment students and 348 controls; and demographic data are available for 397 treatment 
students and 556 controls. For district #3: baseline algebra test scores are available for 165 
treatment students and 158 controls; baseline state test scores are available for 151 treatment 
students and 172 controls; and demographic data are available for 249 treatment students and 
287 controls. 
Appendix Table 2b: 
Assessing Random Assignment with the Analysis Sample 

                                Random Assignment
                                Traditional     Computerized    p-value of
                                Instruction     Instruction     difference
District #1
  Baseline algebra test score   24.7            24.7            0.487
  Baseline state test score     9.3             9.5             0.854
  Baseline district test score  3.2             3.7             0.093
  Female                        53.6            50.6            0.092
  African American              96.9            97.2            0.239
  Hispanic                      0.7             1.1             0.977
  Class size                    26.0            26.8            0.481

District #2
  Baseline algebra test score   24.7            25.0            0.274
  Baseline state test score     6.7             6.9             0.200
  Female                        48.0            45.1            0.634
  African American              49.3            44.5            0.061
  Hispanic                      43.7            46.2            0.054
  Class size                    23.5            24.0            0.353

District #3
  Baseline algebra test score   25.1            25.0            0.320
  Baseline state test score     16.9            16.8            0.437
  Female                        48.0            47.5            0.808
  African American              94.0            94.9            0.290
  Hispanic                      0.0             1.0             0.161
  Class size                    30.2            27.1            0.462





Appendix Table 3: 
Cost Comparisons 
The cost of CAI

                             Number of   Total number   Class              CAI class   CAI labs   Annual cost   Cost per
School                       Classes     of Students    size     Periods   size        needed     per lab       student
                             (1)         (2)            (3)      (4)       (5)         (6)        (7)           (8)
School A                     22          730            33.2     8         30.0        3.0        $52,381       $218
School B                     12          321            26.8     8         26.8        1.5        $52,381       $245
District 1 analysis sample   74          1736           23.5     8         23.5        9.3        $52,381       $279


The cost of reducing class size to 13 students

                             Number of   Total number   Class              New total      New teachers   Salary + benefits   Cost per
School                       Classes     of Students    size     Periods   math classes   required       per teacher         student
                             (1)         (2)            (3)      (4)       (5)            (6)            (7)                 (8)
School A                     22          730            33.2     6         56.2           5.7            $42,143             $329
School B                     12          321            26.8     6         24.7           2.1            $42,143             $278
District 1 analysis sample   74          1736           23.5     6         133.5          9.9            $42,143             $241



Notes: The information on the number of classes and number of students for schools A and B applies to all algebra and pre-algebra classes
in the school while the information on the number of classes and students for the analysis samples only applies to classes that are 
represented in our analysis sample. The number of CAI labs needed equals the total number of students divided by the number of
students each lab serves each day. We assume that the computer lab can be used for the number of periods specified in column (4) of
the top panel and that each CAI class is equal to average class size with a maximum of 30 students (column 5). We assume the cost of
the lab equals $250,000 in fixed costs plus $50,000 every 3 years for training, support, and maintenance and that the lab will be good
for 7 years. New total math classes in column (5) of the bottom panel equals the number of math classes needed for an average class
size of 13 students. Assuming each teacher teaches the number of periods in column (4), column (6) represents the number of new
teachers needed to reduce class size to 13 students. Salary is based on the salary schedule for teachers in district 1 with no experience.
We assume that salary equals 70 percent of total compensation costs. 
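
The arithmetic behind the two panels can be sketched in a few lines of Python. This is an illustrative reading of the notes, not the authors' own calculation. The function and parameter names are hypothetical. The sketch assumes that the recurring $50,000 cost is prorated over the 7-year lab life, which reproduces the $52,381 annual figure in column (7) of the top panel, and it takes the $42,143 compensation figure directly from column (7) of the bottom panel.

```python
def cai_cost_per_student(students, classes, periods,
                         fixed_cost=250_000, recurring_cost=50_000,
                         recurring_interval_years=3, lab_life_years=7,
                         max_cai_class_size=30):
    """Annual CAI lab cost per student (top panel, column 8)."""
    # CAI class size is the average class size, capped at 30 students.
    cai_class_size = min(students / classes, max_cai_class_size)
    # Assumption: the recurring cost is prorated over the life of the lab.
    total_cost = fixed_cost + recurring_cost * lab_life_years / recurring_interval_years
    annual_lab_cost = total_cost / lab_life_years
    # Each lab serves one CAI class per period each day.
    labs_needed = students / (periods * cai_class_size)
    return labs_needed * annual_lab_cost / students


def class_size_cost_per_student(students, classes, periods,
                                target_class_size=13,
                                compensation_per_teacher=42_143):
    """Annual cost per student of cutting class size to 13 (bottom panel, column 8)."""
    new_total_classes = students / target_class_size
    # Each teacher covers `periods` classes, so extra classes / periods = new teachers.
    new_teachers = (new_total_classes - classes) / periods
    return new_teachers * compensation_per_teacher / students


# School A: 730 students in 22 classes.
print(round(cai_cost_per_student(730, 22, periods=8)))         # 218
print(round(class_size_cost_per_student(730, 22, periods=6)))  # 329
```

Using the unrounded average class size (students divided by classes) rather than the rounded value in column (3) reproduces the rounded dollar figures in column (8) for all three rows of both panels.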